OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Survey and bibliography of Arabic optical text recognition

Internal identifier: 002927 (Main/Exploration); previous: 002926; next: 002928

Authors: Badr Al-Badr [United States]; Sabri A. Mahmoud [Saudi Arabia]

Source: Signal Processing (ISSN 0165-1684), vol. 41, no. 1, pp. 49-77

RBID : ISTEX:DC4EB8DB46AAB0EBB61B145C277FF2CB670062F8

Abstract

Research work on Arabic optical text recognition (AOTR), although lagging behind that on other languages, is becoming more intensive than before, and commercial systems for AOTR are becoming available. This paper presents a comprehensive survey and bibliography of research on AOTR, covering all the research publications on AOTR to which the authors had access. This paper introduces the general topic of optical character recognition (OCR) and highlights the characteristics of Arabic text. It also presents a historical review of Arabic text recognition systems. Further, this paper reports on the state of the art in AOTR research and lists the specifications of commercially available systems for AOTR. In this paper, we first underline the capabilities of different AOTR systems, and then introduce a five-stage model for AOTR systems and classify research work according to this model. We devote a section to each of the stages of this model: preprocessing, segmentation, feature extraction, classification, and post-processing. In the preprocessing section, we emphasize the handling of degraded documents and the thinning of Arabic text. In the segmentation section, we discuss methods of segmenting Arabic text and categorize the methods into five general approaches. In the feature extraction and classification sections, we highlight the main techniques and analyze AOTR research works based on those techniques. We then discuss approaches for post-processing and show their relation to the Arabic language. We conclude by pointing out problems and directions for future research on AOTR.
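
The five stages named in the abstract can be read as a chain of functions, each consuming the previous stage's output. The Python sketch below only illustrates that composition and is not taken from the paper: every stage body is a deliberately naive placeholder, and the toy image format, the function names, and the `prototypes` table are all assumptions made for this example. In particular, splitting at blank columns would not work on real Arabic text, which is cursive; it stands in here for whichever segmentation approach a real system uses.

```python
from typing import Dict, List, Tuple

# Toy binary image (assumption): equal-length strings, '#' = ink, '.' = background.
Image = List[str]
Features = Tuple[int, int]


def preprocess(page: Image) -> Image:
    """Stage 1 (placeholder): drop rows that contain no ink."""
    return [row for row in page if "#" in row]


def segment(page: Image) -> List[Image]:
    """Stage 2 (placeholder): cut the page at blank columns."""
    if not page:
        return []
    width = len(page[0])
    segments, start = [], None
    for x in range(width + 1):
        ink = x < width and any(row[x] == "#" for row in page)
        if ink and start is None:
            start = x                      # a run of inked columns begins
        elif not ink and start is not None:
            segments.append([row[start:x] for row in page])
            start = None                   # the run has ended
    return segments


def extract_features(glyph: Image) -> Features:
    """Stage 3 (placeholder): (width in columns, number of ink pixels)."""
    return (len(glyph[0]), sum(row.count("#") for row in glyph))


def classify(features: Features, prototypes: Dict[str, Features]) -> str:
    """Stage 4 (placeholder): label of the nearest prototype (squared distance)."""
    return min(
        prototypes,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(features, prototypes[label])),
    )


def postprocess(labels: List[str]) -> str:
    """Stage 5 (placeholder): join labels; a real system might apply language rules."""
    return "".join(labels)


def recognize(page: Image, prototypes: Dict[str, Features]) -> str:
    """Run the five stages in order."""
    glyphs = segment(preprocess(page))
    return postprocess([classify(extract_features(g), prototypes) for g in glyphs])
```

With the made-up input `["......", "#..##.", "#..##.", "......"]` and prototypes `{"a": (1, 2), "b": (2, 4)}`, `recognize` returns `"ab"`.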

URL: https://api.istex.fr/document/DC4EB8DB46AAB0EBB61B145C277FF2CB670062F8/fulltext/pdf
DOI: 10.1016/0165-1684(94)00090-M


The document in XML format

<record>
  <TEI wicri:istexFullTextTei="biblStruct">
    <teiHeader>
      <fileDesc>
        <titleStmt>
          <title>Survey and bibliography of Arabic optical text recognition</title>
          <author>
            <name sortKey="Al Badr, Badr" sort="Al Badr, Badr" uniqKey="Al Badr B" first="Badr" last="Al-Badr">Badr Al-Badr</name>
          </author>
          <author>
            <name sortKey="Mahmoud, Sabri A" sort="Mahmoud, Sabri A" uniqKey="Mahmoud S" first="Sabri A." last="Mahmoud">Sabri A. Mahmoud</name>
          </author>
        </titleStmt>
        <publicationStmt>
          <idno type="wicri:source">ISTEX</idno>
          <idno type="RBID">ISTEX:DC4EB8DB46AAB0EBB61B145C277FF2CB670062F8</idno>
          <date when="1995" year="1995">1995</date>
          <idno type="doi">10.1016/0165-1684(94)00090-M</idno>
          <idno type="url">https://api.istex.fr/document/DC4EB8DB46AAB0EBB61B145C277FF2CB670062F8/fulltext/pdf</idno>
          <idno type="wicri:Area/Istex/Corpus">000032</idno>
          <idno type="wicri:Area/Istex/Curation">000032</idno>
          <idno type="wicri:Area/Istex/Checkpoint">001C84</idno>
          <idno type="wicri:doubleKey">0165-1684:1995:Al Badr B:survey:and:bibliography</idno>
          <idno type="wicri:Area/Main/Merge">002A84</idno>
          <idno type="wicri:Area/Main/Curation">002927</idno>
          <idno type="wicri:Area/Main/Exploration">002927</idno>
        </publicationStmt>
        <sourceDesc>
          <biblStruct>
            <analytic>
              <title level="a">Survey and bibliography of Arabic optical text recognition</title>
              <author>
                <name sortKey="Al Badr, Badr" sort="Al Badr, Badr" uniqKey="Al Badr B" first="Badr" last="Al-Badr">Badr Al-Badr</name>
                <affiliation wicri:level="1">
                  <country wicri:rule="url">États-Unis</country>
                </affiliation>
                <affiliation wicri:level="4">
                  <country xml:lang="fr">États-Unis</country>
                  <wicri:regionArea>Department of Computer Science and Engineering, Mail Stop FR-35, University of Washington, Seattle, WA 98195</wicri:regionArea>
                  <placeName>
                    <region type="state">Washington (État)</region>
                    <settlement type="city">Seattle</settlement>
                  </placeName>
                  <orgName type="university">Université de Washington</orgName>
                </affiliation>
              </author>
              <author>
                <name sortKey="Mahmoud, Sabri A" sort="Mahmoud, Sabri A" uniqKey="Mahmoud S" first="Sabri A." last="Mahmoud">Sabri A. Mahmoud</name>
                <affiliation wicri:level="1">
                  <country xml:lang="fr">Arabie saoudite</country>
                  <wicri:regionArea>Computer Engineering Department, College of Computers and Information Sciences, King Saud University, P.O. Box 51178, Riyadh 11543</wicri:regionArea>
                  <wicri:noRegion>Riyadh 11543</wicri:noRegion>
                </affiliation>
              </author>
            </analytic>
            <monogr></monogr>
            <series>
              <title level="j">Signal Processing</title>
              <title level="j" type="abbrev">SIGPRO</title>
              <idno type="ISSN">0165-1684</idno>
              <imprint>
                <publisher>ELSEVIER</publisher>
                <date type="published" when="1994">1994</date>
                <biblScope unit="volume">41</biblScope>
                <biblScope unit="issue">1</biblScope>
                <biblScope unit="page" from="49">49</biblScope>
                <biblScope unit="page" to="77">77</biblScope>
              </imprint>
              <idno type="ISSN">0165-1684</idno>
            </series>
            <idno type="istex">DC4EB8DB46AAB0EBB61B145C277FF2CB670062F8</idno>
            <idno type="DOI">10.1016/0165-1684(94)00090-M</idno>
            <idno type="PII">0165-1684(94)00090-M</idno>
          </biblStruct>
        </sourceDesc>
        <seriesStmt>
          <idno type="ISSN">0165-1684</idno>
        </seriesStmt>
      </fileDesc>
      <profileDesc>
        <textClass></textClass>
        <langUsage>
          <language ident="en">en</language>
        </langUsage>
      </profileDesc>
    </teiHeader>
    <front>
      <div type="abstract" xml:lang="en">Research work on Arabic optical text recognition (AOTR), although lagging that of other languages, is becoming more intensive than before and commercial systems for AOTR are becoming available. This paper presents a comprehensive survey and bibliography of research on AOTR, by covering all the research publications on AOTR to which the authors had access. This paper introduces the general topic of optical character recognition (OCR), and highlights the characteristics of Arabic text. It also presents an historical review of the Arabic text recognition systems. Further, this paper reports on the state of the art in AOTR research, and lists the specifications of commercially available systems for AOTR. In this paper, we first underline the capabilities of different AOTR systems, and then introduce a five stage model for AOTR systems and classify research work according to this model. We devote a section to each of the stages of this model: preprocessing, segmentation, feature extraction, classification, and post-processing. In the preprocessing section, we emphasize handling degraded documents, and thinning of Arabic text. In the segmentation section, we discuss methods of segmenting Arabic text and categorize the methods into five general approaches. In the feature extraction and classification sections, we highlight the main techniques and analyze AOTR research works based on those techniques. We then discuss approaches for post-processing and show their relation to the Arabic language. We conclude by pointing problems and directions for future research on AOTR.</div>
    </front>
  </TEI>
  <affiliations>
    <list>
      <country>
        <li>Arabie saoudite</li>
        <li>États-Unis</li>
      </country>
      <region>
        <li>Washington (État)</li>
      </region>
      <settlement>
        <li>Seattle</li>
      </settlement>
      <orgName>
        <li>Université de Washington</li>
      </orgName>
    </list>
    <tree>
      <country name="États-Unis">
        <noRegion>
          <name sortKey="Al Badr, Badr" sort="Al Badr, Badr" uniqKey="Al Badr B" first="Badr" last="Al-Badr">Badr Al-Badr</name>
        </noRegion>
        <name sortKey="Al Badr, Badr" sort="Al Badr, Badr" uniqKey="Al Badr B" first="Badr" last="Al-Badr">Badr Al-Badr</name>
      </country>
      <country name="Arabie saoudite">
        <noRegion>
          <name sortKey="Mahmoud, Sabri A" sort="Mahmoud, Sabri A" uniqKey="Mahmoud S" first="Sabri A." last="Mahmoud">Sabri A. Mahmoud</name>
        </noRegion>
      </country>
    </tree>
  </affiliations>
</record>
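
As displayed, the record uses the `wicri:` prefix (for example `wicri:level` and `wicri:regionArea`) without declaring it, so a namespace-aware XML parser rejects it as it stands. The Python sketch below shows one possible workaround, assuming the record text is available as a string: it injects a placeholder namespace declaration on the root element before parsing with the standard library. The URI `urn:example:wicri`, the helper names, and the abridged `SAMPLE` are assumptions made for this illustration and are not part of Wicri or Dilib.

```python
import xml.etree.ElementTree as ET

# Placeholder URI (assumption): only needed so that the `wicri:` prefix resolves.
WICRI_NS = "urn:example:wicri"

# Abridged copy of the record above, kept short so the example is self-contained.
SAMPLE = """<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader><fileDesc>
<publicationStmt>
<idno type="doi">10.1016/0165-1684(94)00090-M</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic>
<title level="a">Survey and bibliography of Arabic optical text recognition</title>
<author><name>Badr Al-Badr</name>
<affiliation wicri:level="1"><country xml:lang="fr">États-Unis</country></affiliation>
</author>
<author><name>Sabri A. Mahmoud</name></author>
</analytic></biblStruct></sourceDesc>
</fileDesc></teiHeader>
</TEI>
</record>"""


def parse_record(xml_text: str) -> ET.Element:
    """Declare the `wicri` prefix on the root element, then parse."""
    patched = xml_text.replace(
        "<record>", f'<record xmlns:wicri="{WICRI_NS}">', 1
    )
    return ET.fromstring(patched)


def summarize(record: ET.Element) -> dict:
    """Pull the title, author names, and DOI out of a parsed record."""
    analytic = record.find(".//biblStruct/analytic")
    return {
        "title": analytic.findtext("title"),
        "authors": [name.text for name in analytic.findall("author/name")],
        "doi": record.findtext(".//publicationStmt/idno[@type='doi']"),
    }


if __name__ == "__main__":
    print(summarize(parse_record(SAMPLE)))
```

After parsing, prefixed names appear in ElementTree's expanded form, so the `wicri:level` attribute is read with the key `{urn:example:wicri}level`.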

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 002927 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 002927 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     ISTEX:DC4EB8DB46AAB0EBB61B145C277FF2CB670062F8
   |texte=   Survey and bibliography of Arabic optical text recognition
}}
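
When links to many records are needed, the template call above can be produced programmatically. The following Python sketch simply reproduces the layout shown on this page; the function name and its parameters are hypothetical, and it makes no claim about which other values the `Explor lien` template accepts.

```python
def explor_link(wiki: str, area: str, flux: str, step: str,
                key_type: str, key: str, text: str) -> str:
    """Format an `Explor lien` call with the field layout used on this page."""
    fields = [
        ("wiki", wiki),
        ("area", area),
        ("flux", flux),
        ("étape", step),
        ("type", key_type),
        ("clé", key),
        ("texte", text),
    ]
    lines = ["{{Explor lien"]
    for name, value in fields:
        label = f"|{name}="
        # Pad each label to 10 characters so the values line up in one column.
        lines.append(f"   {label:<10}{value}")
    lines.append("}}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(explor_link(
        wiki="Ticri/CIDE",
        area="OcrV1",
        flux="Main",
        step="Exploration",
        key_type="RBID",
        key="ISTEX:DC4EB8DB46AAB0EBB61B145C277FF2CB670062F8",
        text="Survey and bibliography of Arabic optical text recognition",
    ))
```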

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024